Midwestern University

RT has an incredibly low rate of success, poses risk to the provider, and is resource intensive (especially at a smaller or resource-scarce hospital); it can deplete the blood bank; and it is not a common procedure in general, so many practitioners do not have a large volume of practice either performing it or orchestrating a team to perform this procedure as efficiently and effectively as possible.
## We proposed investigating whether there is a substantial increase in mortality based on age and then determining an upper (and possibly lower) limit above (or below) which RT would be contraindicated.
The National Trauma Data Bank story for emergency department thoracotomy: How old is too old?
- ICD-9 procedure code 34.02 (Exploratory Thoracotomy)
- Performed within 15 minutes of arrival to the ED
- Penetrating or blunt mechanism
- Years 2008-2012 from the NTDB were used
- N = 2,585
- Mortality figures reported: 92.8%, 92.3%, 94.9%, 99.4% (N = 140; there was a single individual >60 y/o who survived), and 80% (N = 20)

A recent study investigating penetrating cardiac injuries in the NTDB developed a predictive model for outcomes with a predictive power of 93% that was more robust than its previous counterparts. Advances in machine learning, predictive analytics, and increased standardization of large databases such as the NTDB will allow us to better utilize evidence-based medicine for critical decision-making and improved patient outcomes.
- Data for 2007-2022 were requested and received from the NTDB.
- The .csv format was used for this project.
- Processing was done in Python 3 along with various packages, namely pandas and numpy.
- Each patient is assigned an INC_KEY (Incident Key), which links records across files.
- Text labels were mapped out of the integer-based labeling system used by certain categorical data fields within the data.
- The NTDB provides a data dictionary (.pdf) that describes what a lot of the data fields are, what file they can be found in, what years they were actively used, and whether they were renamed from previous years, so you can map this data back to those.
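As a sketch of this workflow, a minimal pandas example might look like the following. The tables, column names, and the 1/2 sex coding here are toy stand-ins for illustration, not the official NTDB file layout; the real field names and code mappings come from the data-dictionary PDF.

```python
import pandas as pd

# Toy stand-ins for NTDB tables; the real data ships as one .csv per table,
# and the column names here are illustrative, not the official NTDB names.
patients = pd.DataFrame({"INC_KEY": [101, 102], "SEX": [1, 2], "AGE": [34, 61]})
procedures = pd.DataFrame({
    "INC_KEY": [101, 101, 102],
    "PROC_CODE": ["A", "B", "A"],        # placeholder procedure codes
    "MINUTES_TO_PROC": [12, 14, 9],
})

# INC_KEY links tables: join procedure rows onto the patient-level table.
merged = procedures.merge(patients, on="INC_KEY", how="left")

# Translate an integer-coded categorical field back to text using a mapping
# transcribed from the data-dictionary PDF (the 1/2 coding is assumed here).
sex_map = {1: "Male", 2: "Female"}
merged["SEX_TEXT"] = merged["SEX"].map(sex_map)
print(merged[["INC_KEY", "PROC_CODE", "SEX_TEXT"]])
```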
Things get complicated at the 2016-2017 boundary with the transition from ICD-9 to ICD-10, in which the word "Thoracotomy" does not exist (there is an entire paper in which some very upset individuals complained about the implications of this). The data structure is fairly consistent through 2016, with the exception of the massive changes that happen during 2017. Getting the 2007-2016 data range to play nicely with anything from 2017-2022 has been a struggle: I've tried many methods to both automate and borderline manually combine this data with the newer data, but for at least 10 large technical reasons, and to little avail, I have not been successful in this endeavor (at least not entirely). For the full range (2007-2022) I can seemingly get it to mesh reasonably well with the newer data; HOWEVER, when I plot the mortality rates of the data in the 2007-2016 range, they are excessively low (~30-50%). Clearly something is wrong with my filtering: it is either not capturing all of the deceased cases or including an excessive number of survivor cases that should realistically not be in this calculation. Going forward, I will continue to tinker with and try to resolve this problem, as I think it would approximately double the sample size of what is ultimately used (at this time) for this project (~7-8k).
So how do we find RT cases when "Thoracotomy" no longer exists in the ICD-10 codes?

Using an ICD-10 reference (https://www.icd10data.com/), along with helpful input on what to look for from Dr. Schlanser and Julian Henderson, we tracked down and isolated a list of ICD-10 procedures that, in the context of the ED and being performed within the first 15-20 minutes of arrival, would presumably only be done in the context of an RT (all sternotomy procedures were excluded). This let us isolate many patients who likely underwent RT.

In 2013, a new subset of fields was quietly introduced covering Hemorrhage Control Surgery, along with the times following arrival at which these were performed: Thoracotomy, as well as Sternotomy (excluded), among other damage control surgeries. The listed procedure is supposed to be the initial procedure performed, so if Thoracotomy is listed, this does not rule out that a Sternotomy was also done; it only implies that the Sternotomy was not the first or primary procedure. Aside from the initial damage control procedure listed, there are no additional fields detailing whether and which subsequent damage control procedures were performed (i.e., if another damage control procedure was done first and a thoracotomy was done after it, we would not know, and that patient would be excluded).

Using the Hemorrhage Control Surgery labels and the relevant ICD-10 codes, and counting individuals with multiple ICD-10 codes only once, I obtained a unique list of INC_KEYs and grabbed all of the patient data on them from 2017-2022. For patients who did not have Thoracotomy listed as their damage control procedure but did have ICD-10 procedures suggestive of an RT, I selected these patients (many of whom had multiple such procedures, which is not surprising; getting intracardiac epi plus a left visual inspection of the heart, open approach, makes sense) and then selected the minimum time out of their RT-related procedures as the time to RT.
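The "count each patient once, take the earliest RT-related procedure as the time to RT" step can be sketched with a pandas groupby. The rows and procedure codes below are placeholders, not real ICD-10-PCS values or NTDB data:

```python
import pandas as pd

# Toy rows: several RT-suggestive procedures per patient
# (codes are placeholders, not real ICD-10-PCS codes).
rt_procs = pd.DataFrame({
    "INC_KEY": [101, 101, 102, 103],
    "PROC_CODE": ["RT_A", "RT_B", "RT_A", "RT_C"],
    "MINUTES_TO_PROC": [14, 11, 9, 18],
})

# Collapse to one row per INC_KEY, taking the minimum time among each
# patient's RT-related procedures as that patient's time to RT.
time_to_rt = rt_procs.groupby("INC_KEY")["MINUTES_TO_PROC"].min()
print(time_to_rt)
```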
Excluded: Sternotomy, as indicated by ICD-10 codes or by the damage control label.
Only include cases with Thoracotomy explicitly listed as the damage control procedure and exclude all other cases from the study (maintain the purity of the data at the cost of sample size)
OR
Include all of the ICD-10 Thoracotomy-related procedures, even those not explicitly labeled Thoracotomy by the damage control surgery label (the current implementation; more samples but possibly lower purity).
Using the ICD-10 code data makes processing more complicated (although once it's done correctly in the initial stages of the project, you don't have to think about it again, which I believe I have done; but again, this route has more places to make mistakes). You are also admitting the possibility that the individual did not have a Thoracotomy done at all, although for these individuals an RT certainly can't be excluded either.
Some cases have times recorded in Hours instead of Minutes, so I converted those to minutes prior to filtering by time. This may differ from the Loyola paper, but I imagine it depends on how deep into the weeds you want to get.

An analogy: think of Beer's Law, where you create a serially diluted standard solution with known concentrations and use it with your spectrometer to fit a line (y = mx + b) for predicting the concentration of an unknown solution. Instead of a single feature x, here there are many: y = a1b1 + a2b2 + a3b3 + ... + anbn + c, where each an is a weight (very much like m above) multiplied by bn, the value of a feature. The machine part of machine learning is whatever algorithm you are using to learn relationships about your data and tune your weights (an) so that the sum of all of these paired terms plus the intercept, c, equals a prediction. (This is incredibly reductionist; it's much more painful to learn than this, and I don't want to go there and neither do you (but we could), but it is beyond the scope of right now.) For example: Survived/Deceased? = (a1 x EMS SBP) + (a2 x EMS Pulse Rate) + (a3 x EMS Respiratory Rate) + (a4 x EMS Total GCS) + c, where the values an are your weights and c is an intercept determined by the algorithm to fit your data.

If you add a penalty to the weights that dictates how big or small they are allowed to be, or how many of them will be forced to equal 0, you get algorithms called RIDGE, LASSO, or ElasticNet (the last is basically a combination of both RIDGE and LASSO, and you tune a value to determine how much of each you want). There is also XGBoost, which I have used in the app as well, and it performs the best out of all of these.

v0.0 of the app is pretty straightforward: you input all of the EMS data after selecting which algorithm you want to use, click Predict at the bottom, and it will make a prediction along with returning the Confidence, which is the probability with which the algorithm thought the outcome was the right answer.
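A minimal sketch of this weighted-sum idea with scikit-learn, using synthetic stand-in data (none of this is NTDB data, and the outcome rule is invented purely for demonstration). The L1 penalty plays the LASSO role described above, forcing some weights toward exactly 0:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in features: EMS SBP, pulse rate, respiratory rate, total GCS.
rng = np.random.default_rng(0)
X = rng.normal(loc=[80.0, 110.0, 18.0, 8.0],
               scale=[25.0, 30.0, 6.0, 4.0], size=(200, 4))
# Toy outcome rule for demonstration only: low SBP and low GCS -> deceased (1).
y = ((X[:, 0] < 80) & (X[:, 3] < 8)).astype(int)

# Logistic regression with an L1 penalty: the penalty can force some of the
# weights a_n in a1*b1 + a2*b2 + ... + an*bn + c to be exactly 0 (LASSO-style).
model = LogisticRegression(penalty="l1", solver="liblinear").fit(X, y)

print("weights a_n:", model.coef_[0])
print("intercept c:", model.intercept_[0])
```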
For example, the model might output 0.11 for class 0 (Survived) and 0.89 for class 1 (Deceased); it then selects the higher of these two values to make its decision. The value 0.89, converted to a percentage, is the Confidence that is output along with the predicted class. v0.0 is simple and effective but has some drawbacks, which led to v1.0, which allows you to select whether you want to use information from the EMS, ED, or Both settings. (Regarding confidence, I would like to find a better metric that incorporates the number of input data fields as a parameter.)

Tuning the hyperparameters works like this: you pick candidate hyperparameters, train the models on a subset of the data (which cannot have missing data in it) to find the best hyperparameters, and then you take those and evaluate the model on one last held-out group of data that the algorithm hasn't seen before. I also built a Dummy algorithm that does exactly one thing: predict that the outcome is death no matter what the input is. I use this as a baseline that my models have to beat to be considered any good (along with being better than random guessing, i.e., an AUROC > 0.5).

Categorical variables (e.g., Mechanism of Injury) are handled by converting each unique instance of the categorical variable to an integer and saving a dictionary with these mappings, which you use to translate back their meaning on the back end if you need to.

It took ~285k different models to make this application work. I used to have access to a supercomputer (the MSU HPCC), basically the equivalent of 50k desktops put together that you could command at will to run massive amounts of code; one run took ~50.5 hours and churned out around 40k of the models, during which time I thought it might combust. I also used the desktop that I built a few years ago (~8X my laptop's computing power). That's where v2.0 comes in: my PC was needed so that my laptop wasn't running this code for 10+ days in a row.
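The dummy baseline, the AUROC comparison, and the "Confidence" taken from class probabilities can be sketched like this (all data is synthetic and made up for illustration; the feature table stands in for the real EMS inputs):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the EMS feature table (not NTDB data).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

# Baseline: always predict the majority class ("death no matter what").
dummy = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
model = LogisticRegression().fit(X_tr, y_tr)

# predict_proba returns [P(class 0), P(class 1)]; the larger value, as a
# percentage, is the Confidence reported alongside the predicted class.
p0, p1 = model.predict_proba(X_te[:1])[0]
print(f"confidence: {100 * max(p0, p1):.0f}%")

# A constant predictor scores AUROC 0.5; a useful model must beat it.
print("dummy AUROC:", roc_auc_score(y_te, dummy.predict_proba(X_te)[:, 1]))
print("model AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```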
It turns out MacOS and Windows do not play nicely when sharing code between the two systems (something I've never done before and will hopefully never do again), so v2.0 is still in progress and running the models as we speak. There is also a bug in v1.0 that predicts the same example patient I presented earlier as having a 99% chance of survival (we all know that's not how that one is going to pan out, sorry kid), so I need to figure that out, although by the time v2.0 finishes, this may not be necessary at all. The last but certainly not least point: I am currently paying a $10 per month service to host the website, but this is only viable in the short term, as the app is still basically being hosted from the Terminal on my laptop; if my laptop is not running the code for the app, then the app does not exist online (problem).
The plan is to move hosting to the cloud (Amazon Web Services (AWS)) so the app can be up all the time.